Overview

Dataset statistics

Number of variables: 23
Number of observations: 3900
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 MiB
Average record size in memory: 356.8 B

Variable types

NUM: 19
CAT: 3
BOOL: 1

Reproduction

Analysis started: 2020-05-10 01:03:10.317732
Analysis finished: 2020-05-10 01:03:58.755557
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Time_Room_Service is highly correlated with Deposit_Kept (High Correlation)
Deposit_Kept is highly correlated with Time_Room_Service (High Correlation)
Room has 146 (3.7%) zeros (Zeros)
Check-in/Check-out has 208 (5.3%) zeros (Zeros)
F&B has 173 (4.4%) zeros (Zeros)
Entertainment has 84 (2.2%) zeros (Zeros)
Deposit_Kept has 2167 (55.6%) zeros (Zeros)
Time_Room_Service has 2158 (55.3%) zeros (Zeros)

Variables

Guest_ID
Real number (ℝ≥0)

UNIQUE
Distinct count3900
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean16412.674871794872
Minimum10007
Maximum22996
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum10007
5-th percentile10610.9
Q113171.5
median16431.5
Q319648.25
95-th percentile22273.25
Maximum22996
Range12989
Interquartile range (IQR)6476.75

Descriptive statistics

Standard deviation3725.011331
Coefficient of variation (CV)0.2269594298
Kurtosis-1.178374853
Mean16412.67487
Median Absolute Deviation (MAD)3216.357923
Skewness0.003653074544
Sum64009432
Variance13875709.42
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[10007. 22996.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
10239 1 < 0.1%
 
21111 1 < 0.1%
 
18536 1 < 0.1%
 
10932 1 < 0.1%
 
17069 1 < 0.1%
 
10924 1 < 0.1%
 
12971 1 < 0.1%
 
17065 1 < 0.1%
 
12967 1 < 0.1%
 
17061 1 < 0.1%
 
Other values (3890) 3890 99.7%
 
ValueCountFrequency (%) 
10007 1 < 0.1%
 
10009 1 < 0.1%
 
10015 1 < 0.1%
 
10016 1 < 0.1%
 
10017 1 < 0.1%
 
ValueCountFrequency (%) 
22996 1 < 0.1%
 
22995 1 < 0.1%
 
22993 1 < 0.1%
 
22990 1 < 0.1%
 
22987 1 < 0.1%
 

Gender
Categorical

Distinct count2
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size30.6 KiB
ValueCountFrequency (%) 
Female 2032 52.1%
 
Male 1868 47.9%
 

Length

Max length6
Mean length5.042051282
Min length4
ValueCountFrequency (%) 
Lowercase_Letter 4 66.7%
 
Uppercase_Letter 2 33.3%
 
ValueCountFrequency (%) 
Latin 6 100.0%
 
ValueCountFrequency (%) 
ASCII 6 100.0%
 

Frequent_Traveler
Boolean

Distinct count2
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size30.6 KiB
ValueCountFrequency (%) 
1 3200 82.1%
 
0 700 17.9%
 

Age
Real number (ℝ≥0)

Distinct count73
Unique (%)1.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean39.91692307692308
Minimum7
Maximum80
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum7
5-th percentile15
Q127
median41
Q352
95-th percentile64
Maximum80
Range73
Interquartile range (IQR)25

Descriptive statistics

Standard deviation15.26546393
Coefficient of variation (CV)0.3824308778
Kurtosis-0.7517269127
Mean39.91692308
Median Absolute Deviation (MAD)12.74626746
Skewness-0.03040519735
Sum155676
Variance233.0343891
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 7. 13.5 19.5 21.5 26.5 37.5 54.5 60.5 70.5 80. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
45 114 2.9%
 
52 111 2.8%
 
39 107 2.7%
 
23 98 2.5%
 
40 98 2.5%
 
25 96 2.5%
 
24 96 2.5%
 
41 93 2.4%
 
22 92 2.4%
 
46 88 2.3%
 
Other values (63) 2907 74.5%
 
ValueCountFrequency (%) 
7 22 0.6%
 
8 20 0.5%
 
9 14 0.4%
 
10 22 0.6%
 
11 28 0.7%
 
ValueCountFrequency (%) 
80 6 0.2%
 
79 1 < 0.1%
 
78 2 0.1%
 
77 5 0.1%
 
76 4 0.1%
 

Type
Categorical

Distinct count2
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size30.6 KiB
ValueCountFrequency (%) 
Business travel 2695 69.1%
 
Personal Travel 1205 30.9%
 

Length

Max length15
Mean length15
Min length15
ValueCountFrequency (%) 
Lowercase_Letter 11 73.3%
 
Uppercase_Letter 3 20.0%
 
Space_Separator 1 6.7%
 
ValueCountFrequency (%) 
Latin 14 93.3%
 
Common 1 6.7%
 
ValueCountFrequency (%) 
ASCII 15 100.0%
 

Flight_Class
Categorical

Distinct count3
Unique (%)0.1%
Missing0
Missing (%)0.0%
Memory size30.6 KiB
ValueCountFrequency (%) 
Business 1849 47.4%
 
Eco 1778 45.6%
 
Eco Plus 273 7.0%
 

Length

Max length8
Mean length5.720512821
Min length3
ValueCountFrequency (%) 
Lowercase_Letter 8 66.7%
 
Uppercase_Letter 3 25.0%
 
Space_Separator 1 8.3%
 
ValueCountFrequency (%) 
Latin 11 91.7%
 
Common 1 8.3%
 
ValueCountFrequency (%) 
ASCII 12 100.0%
 

Points
Real number (ℝ≥0)

Distinct count2358
Unique (%)60.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1964.1866666666667
Minimum50
Maximum6537
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum50
5-th percentile344.95
Q11332
median1881.5
Q32531
95-th percentile3796
Maximum6537
Range6487
Interquartile range (IQR)1199

Descriptive statistics

Standard deviation1021.563774
Coefficient of variation (CV)0.5200950559
Kurtosis0.3189905056
Mean1964.186667
Median Absolute Deviation (MAD)791.3708034
Skewness0.4855762506
Sum7660328
Variance1043592.545
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 50. 133.5 202. 702. 868. ... 2743.5 3913. 4035. 5174. 6537. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
1487 7 0.2%
 
1874 7 0.2%
 
1653 7 0.2%
 
1805 6 0.2%
 
1765 6 0.2%
 
1853 6 0.2%
 
1812 6 0.2%
 
1940 6 0.2%
 
2374 6 0.2%
 
1622 5 0.1%
 
Other values (2348) 3838 98.4%
 
ValueCountFrequency (%) 
50 1 < 0.1%
 
55 2 0.1%
 
58 1 < 0.1%
 
62 1 < 0.1%
 
63 1 < 0.1%
 
ValueCountFrequency (%) 
6537 1 < 0.1%
 
5816 1 < 0.1%
 
5776 1 < 0.1%
 
5722 1 < 0.1%
 
5693 1 < 0.1%
 

Room
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.8482051282051284
Minimum0
Maximum5
Zeros146
Zeros (%)3.7%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.396156509
Coefficient of variation (CV)0.4901881873
Kurtosis-0.9458255092
Mean2.848205128
Median Absolute Deviation (MAD)1.177744116
Skewness-0.09536277055
Sum11108
Variance1.949252997
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
2 886 22.7%
 
4 862 22.1%
 
3 849 21.8%
 
1 611 15.7%
 
5 546 14.0%
 
0 146 3.7%
 
ValueCountFrequency (%) 
0 146 3.7%
 
1 611 15.7%
 
2 886 22.7%
 
3 849 21.8%
 
4 862 22.1%
 
ValueCountFrequency (%) 
5 546 14.0%
 
4 862 22.1%
 
3 849 21.8%
 
2 886 22.7%
 
1 611 15.7%
 

Check-in/Check-out
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.9956410256410257
Minimum0
Maximum5
Zeros208
Zeros (%)5.3%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.519719563
Coefficient of variation (CV)0.5073103052
Kurtosis-1.049238343
Mean2.995641026
Median Absolute Deviation (MAD)1.278756345
Skewness-0.2843441178
Sum11683
Variance2.30954755
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 929 23.8%
 
5 777 19.9%
 
3 720 18.5%
 
2 656 16.8%
 
1 610 15.6%
 
0 208 5.3%
 
ValueCountFrequency (%) 
0 208 5.3%
 
1 610 15.6%
 
2 656 16.8%
 
3 720 18.5%
 
4 929 23.8%
 
ValueCountFrequency (%) 
5 777 19.9%
 
4 929 23.8%
 
3 720 18.5%
 
2 656 16.8%
 
1 610 15.6%
 

F&B
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.856923076923077
Minimum0
Maximum5
Zeros173
Zeros (%)4.4%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.442597743
Coefficient of variation (CV)0.5049480521
Kurtosis-0.9986324122
Mean2.856923077
Median Absolute Deviation (MAD)1.220109665
Skewness-0.1208369828
Sum11142
Variance2.081088247
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 842 21.6%
 
3 825 21.2%
 
2 815 20.9%
 
1 639 16.4%
 
5 606 15.5%
 
0 173 4.4%
 
ValueCountFrequency (%) 
0 173 4.4%
 
1 639 16.4%
 
2 815 20.9%
 
3 825 21.2%
 
4 842 21.6%
 
ValueCountFrequency (%) 
5 606 15.5%
 
4 842 21.6%
 
3 825 21.2%
 
2 815 20.9%
 
1 639 16.4%
 

Location
Real number (ℝ≥0)

Distinct count5
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.012051282051282
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.298136608
Coefficient of variation (CV)0.4309809119
Kurtosis-1.071411
Mean3.012051282
Median Absolute Deviation (MAD)1.057353189
Skewness-0.08076922033
Sum11747
Variance1.685158653
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 1.5 2.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
3 1016 26.1%
 
4 934 23.9%
 
2 721 18.5%
 
1 656 16.8%
 
5 573 14.7%
 
ValueCountFrequency (%) 
1 656 16.8%
 
2 721 18.5%
 
3 1016 26.1%
 
4 934 23.9%
 
5 573 14.7%
 
ValueCountFrequency (%) 
5 573 14.7%
 
4 934 23.9%
 
3 1016 26.1%
 
2 721 18.5%
 
1 656 16.8%
 

Wifi
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.2666666666666666
Minimum0
Maximum5
Zeros6
Zeros (%)0.2%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.319518929
Coefficient of variation (CV)0.4039343661
Kurtosis-1.127038831
Mean3.266666667
Median Absolute Deviation (MAD)1.146974359
Skewness-0.1890824049
Sum12740
Variance1.741130204
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 925 23.7%
 
5 899 23.1%
 
2 839 21.5%
 
3 818 21.0%
 
1 413 10.6%
 
0 6 0.2%
 
ValueCountFrequency (%) 
0 6 0.2%
 
1 413 10.6%
 
2 839 21.5%
 
3 818 21.0%
 
4 925 23.7%
 
ValueCountFrequency (%) 
5 899 23.1%
 
4 925 23.7%
 
3 818 21.0%
 
2 839 21.5%
 
1 413 10.6%
 

Entertainment
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.384102564102564
Minimum0
Maximum5
Zeros84
Zeros (%)2.2%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median4
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.335357202
Coefficient of variation (CV)0.3945971425
Kurtosis-0.5402427337
Mean3.384102564
Median Absolute Deviation (MAD)1.133547666
Skewness-0.5970774815
Sum13198
Variance1.783178856
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1288 33.0%
 
5 877 22.5%
 
3 701 18.0%
 
2 608 15.6%
 
1 342 8.8%
 
0 84 2.2%
 
ValueCountFrequency (%) 
0 84 2.2%
 
1 342 8.8%
 
2 608 15.6%
 
3 701 18.0%
 
4 1288 33.0%
 
ValueCountFrequency (%) 
5 877 22.5%
 
4 1288 33.0%
 
3 701 18.0%
 
2 608 15.6%
 
1 342 8.8%
 

Gym
Real number (ℝ≥0)

Distinct count5
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.5264102564102564
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median4
Q35
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.291584987
Coefficient of variation (CV)0.3662605576
Kurtosis-0.7997210228
Mean3.526410256
Median Absolute Deviation (MAD)1.104800131
Skewness-0.5678137875
Sum13753
Variance1.668191778
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1257 32.2%
 
5 1058 27.1%
 
3 653 16.7%
 
2 544 13.9%
 
1 388 9.9%
 
ValueCountFrequency (%) 
1 388 9.9%
 
2 544 13.9%
 
3 653 16.7%
 
4 1257 32.2%
 
5 1058 27.1%
 
ValueCountFrequency (%) 
5 1058 27.1%
 
4 1257 32.2%
 
3 653 16.7%
 
2 544 13.9%
 
1 388 9.9%
 

Spa
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4812820512820513
Minimum0
Maximum5
Zeros2
Zeros (%)0.1%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median4
Q35
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.292116447
Coefficient of variation (CV)0.3711610917
Kurtosis-0.8956475122
Mean3.481282051
Median Absolute Deviation (MAD)1.116219855
Skewness-0.4853974402
Sum13577
Variance1.669564911
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1201 30.8%
 
5 1023 26.2%
 
3 679 17.4%
 
2 626 16.1%
 
1 369 9.5%
 
0 2 0.1%
 
ValueCountFrequency (%) 
0 2 0.1%
 
1 369 9.5%
 
2 626 16.1%
 
3 679 17.4%
 
4 1201 30.8%
 
ValueCountFrequency (%) 
5 1023 26.2%
 
4 1201 30.8%
 
3 679 17.4%
 
2 626 16.1%
 
1 369 9.5%
 

Staff
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.483076923076923
Minimum0
Maximum5
Zeros1
Zeros (%)< 0.1%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q13
median4
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.257333765
Coefficient of variation (CV)0.3609836339
Kurtosis-0.7329475055
Mean3.483076923
Median Absolute Deviation (MAD)1.070253254
Skewness-0.5198169209
Sum13584
Variance1.580888196
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 1.5 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1229 31.5%
 
5 957 24.5%
 
3 833 21.4%
 
2 504 12.9%
 
1 376 9.6%
 
0 1 < 0.1%
 
ValueCountFrequency (%) 
0 1 < 0.1%
 
1 376 9.6%
 
2 504 12.9%
 
3 833 21.4%
 
4 1229 31.5%
 
ValueCountFrequency (%) 
5 957 24.5%
 
4 1229 31.5%
 
3 833 21.4%
 
2 504 12.9%
 
1 376 9.6%
 

Pool
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4951282051282053
Minimum0
Maximum5
Zeros10
Zeros (%)0.3%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median4
Q35
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.282480788
Coefficient of variation (CV)0.3669338328
Kurtosis-0.8390520795
Mean3.495128205
Median Absolute Deviation (MAD)1.106606969
Skewness-0.4968268433
Sum13631
Variance1.644756973
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1204 30.9%
 
5 1030 26.4%
 
3 679 17.4%
 
2 651 16.7%
 
1 326 8.4%
 
0 10 0.3%
 
ValueCountFrequency (%) 
0 10 0.3%
 
1 326 8.4%
 
2 651 16.7%
 
3 679 17.4%
 
4 1204 30.9%
 
ValueCountFrequency (%) 
5 1030 26.4%
 
4 1204 30.9%
 
3 679 17.4%
 
2 651 16.7%
 
1 326 8.4%
 

Baggage_Handling
Real number (ℝ≥0)

Distinct count5
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.6956410256410255
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median4
Q35
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.142986676
Coefficient of variation (CV)0.3092796807
Kurtosis-0.2663876412
Mean3.695641026
Median Absolute Deviation (MAD)0.9367589744
Skewness-0.7097484223
Sum14413
Variance1.306418543
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1429 36.6%
 
5 1067 27.4%
 
3 771 19.8%
 
2 416 10.7%
 
1 217 5.6%
 
ValueCountFrequency (%) 
1 217 5.6%
 
2 416 10.7%
 
3 771 19.8%
 
4 1429 36.6%
 
5 1067 27.4%
 
ValueCountFrequency (%) 
5 1067 27.4%
 
4 1429 36.6%
 
3 771 19.8%
 
2 416 10.7%
 
1 217 5.6%
 

Reception
Real number (ℝ≥0)

Distinct count5
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.337948717948718
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size30.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median3
Q34
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.245312688
Coefficient of variation (CV)0.3730772379
Kurtosis-0.7437995727
Mean3.337948718
Median Absolute Deviation (MAD)1.042034188
Skewness-0.3947413292
Sum13018
Variance1.550803691
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 1.5 2.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
3 1116 28.6%
 
4 1101 28.2%
 
5 784 20.1%
 
1 452 11.6%
 
2 447 11.5%
 
ValueCountFrequency (%) 
1 452 11.6%
 
2 447 11.5%
 
3 1116 28.6%
 
4 1101 28.2%
 
5 784 20.1%
 
ValueCountFrequency (%) 
5 784 20.1%
 
4 1101 28.2%
 
3 1116 28.6%
 
2 447 11.5%
 
1 452 11.6%
 

Cleanliness
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.697948717948718
Minimum0
Maximum5
Zeros1
Zeros (%)< 0.1%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q13
median4
Q35
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.13402727
Coefficient of variation (CV)0.3066638713
Kurtosis-0.23456247
Mean3.697948718
Median Absolute Deviation (MAD)0.9271008547
Skewness-0.7148839475
Sum14422
Variance1.286017848
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 1.5 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1459 37.4%
 
5 1050 26.9%
 
3 762 19.5%
 
2 422 10.8%
 
1 206 5.3%
 
0 1 < 0.1%
 
ValueCountFrequency (%) 
0 1 < 0.1%
 
1 206 5.3%
 
2 422 10.8%
 
3 762 19.5%
 
4 1459 37.4%
 
ValueCountFrequency (%) 
5 1050 26.9%
 
4 1459 37.4%
 
3 762 19.5%
 
2 422 10.8%
 
1 206 5.3%
 

Online_Booking
Real number (ℝ≥0)

Distinct count6
Unique (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.345128205128205
Minimum0
Maximum5
Zeros1
Zeros (%)< 0.1%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.276924186
Coefficient of variation (CV)0.3817265312
Kurtosis-0.8834224102
Mean3.345128205
Median Absolute Deviation (MAD)1.086421565
Skewness-0.3656400321
Sum13046
Variance1.630535377
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 2.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 1082 27.7%
 
3 971 24.9%
 
5 852 21.8%
 
2 551 14.1%
 
1 443 11.4%
 
0 1 < 0.1%
 
ValueCountFrequency (%) 
0 1 < 0.1%
 
1 443 11.4%
 
2 551 14.1%
 
3 971 24.9%
 
4 1082 27.7%
 
ValueCountFrequency (%) 
5 852 21.8%
 
4 1082 27.7%
 
3 971 24.9%
 
2 551 14.1%
 
1 443 11.4%
 

Deposit_Kept
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count182
Unique (%)4.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14.550256410256411
Minimum0
Maximum569
Zeros2167
Zeros (%)55.6%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q313
95-th percentile74.05
Maximum569
Range569
Interquartile range (IQR)13

Descriptive statistics

Standard deviation36.53200823
Coefficient of variation (CV)2.510746697
Kurtosis39.99143583
Mean14.55025641
Median Absolute Deviation (MAD)19.77595556
Skewness5.22328422
Sum56746
Variance1334.587625
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e-01 2.500e+00 7.500e+00 1.850e+01 ... 1.100e+02 1.595e+02 2.295e+02 3.545e+02 5.690e+02], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 2167 55.6%
 
1 124 3.2%
 
2 91 2.3%
 
3 73 1.9%
 
4 71 1.8%
 
5 68 1.7%
 
6 65 1.7%
 
7 60 1.5%
 
16 45 1.2%
 
10 44 1.1%
 
Other values (172) 1092 28.0%
 
ValueCountFrequency (%) 
0 2167 55.6%
 
1 124 3.2%
 
2 91 2.3%
 
3 73 1.9%
 
4 71 1.8%
 
ValueCountFrequency (%) 
569 1 < 0.1%
 
415 1 < 0.1%
 
358 1 < 0.1%
 
351 1 < 0.1%
 
341 1 < 0.1%
 

Time_Room_Service
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count190
Unique (%)4.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.4865641025641028
Minimum0.0
Maximum54.3
Zeros2158
Zeros (%)55.3%
Memory size30.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31.3
95-th percentile7.305
Maximum54.3
Range54.3
Interquartile range (IQR)1.3

Descriptive statistics

Standard deviation3.684135386
Coefficient of variation (CV)2.478288948
Kurtosis37.40153856
Mean1.486564103
Median Absolute Deviation (MAD)2.009098277
Skewness5.116669975
Sum5797.6
Variance13.57285354
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e-02 4.500e-01 1.150e+00 2.150e+00 ... 7.550e+00 1.045e+01 1.775e+01 3.545e+01 5.430e+01], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 2158 55.3%
 
0.1 100 2.6%
 
0.4 81 2.1%
 
0.2 80 2.1%
 
0.3 80 2.1%
 
0.5 64 1.6%
 
0.6 58 1.5%
 
0.7 56 1.4%
 
0.8 55 1.4%
 
0.9 52 1.3%
 
Other values (180) 1116 28.6%
 
ValueCountFrequency (%) 
0 2158 55.3%
 
0.1 100 2.6%
 
0.2 80 2.1%
 
0.3 80 2.1%
 
0.4 81 2.1%
 
ValueCountFrequency (%) 
54.3 1 < 0.1%
 
41 1 < 0.1%
 
35.7 1 < 0.1%
 
35.2 1 < 0.1%
 
34.9 1 < 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
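
The calculation described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code the report generator runs; the function name `pearson_r` and the sample values are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations; the 1/n factors of the
    # covariance and the standard deviations cancel, so they are omitted.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant column makes the denominator zero and raises ZeroDivisionError.
    return cov / sqrt(var_x * var_y)


# Hypothetical values, chosen only to illustrate the calculation.
deposit = [0, 10, 20, 30, 40]
minutes = [0.0, 1.1, 1.9, 3.2, 3.8]

r = pearson_r(deposit, minutes)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * d + 5 for d in deposit], minutes)
print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale mentioned above.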

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
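
As a rough sketch of that recipe, the snippet below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. It uses only the standard library; the helper names `average_ranks` and `spearman_rho`, the choice of average ranks for ties, and the sample data are assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because the relationship is strictly increasing, the ranks of x and y coincide, which is the situation where ρ reaches its maximum even though the relationship is not linear.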

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
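
The pair-counting definition above can be sketched directly. The version below implements exactly the formula as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries may apply a tie-corrected variant instead, so results can differ when ties are present. The function name `kendall_tau_a` and the sample values are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    return (concordant - discordant) / pairs


# Hypothetical data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.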

Missing values

Sample

First rows

Guest_ID Gender Frequent_Traveler Age Type Flight_Class Points Room Check-in/Check-out F&B Location Wifi Entertainment Gym Spa Staff Pool Baggage_Handling Reception Cleanliness Online_Booking Deposit_Kept Time_Room_Service
019847Female038Business travelEco20973334311113314111213.5
112433Female146Business travelBusiness16293333254444454300.0
210273Male133Business travelBusiness16155555444453332400.0
312457Male038Business travelEco15203334232233445200.0
422903Female027Business travelBusiness352433342322435352100.0
522449Male137Personal TravelEco31922425121144455100.1
614787Male041Business travelBusiness15185552151153555100.0
715158Male132Business travelBusiness138855555545425455242.4
822185Female024Business travelBusiness17965253511435413100.0
914633Male131Business travelBusiness21842222222242544230.0

Last rows

Guest_ID Gender Frequent_Traveler Age Type Flight_Class Points Room Check-in/Check-out F&B Location Wifi Entertainment Gym Spa Staff Pool Baggage_Handling Reception Cleanliness Online_Booking Deposit_Kept Time_Room_Service
389020327Male033Business travelEco214643435425441225403.0
389114785Male172Business travelBusiness38543222534333343300.0
389222341Female113Personal TravelEco34535452151114442100.0
389320010Female127Personal TravelEco17341111444443534450.0
389417139Male156Business travelEco1911355543442411147010.7
389518266Female134Business travelBusiness35325555333535535327330.8
389621243Female113Personal TravelEco17015515534444454200.0
389719539Female117Personal TravelEco16434543144152434103.2
389815253Male123Personal TravelEco27213233344223344415216.4
389922708Female152Business travelBusiness2180502554555545400.3